This project explores public health data from Open Food Facts. The dataset describes food products sold in different countries, including their brands, nutrition grades, ingredients, and nutrient values per 100 g. The dataset is available at https://world.openfoodfacts.org/
| | code | url | creator | created_t | created_datetime | last_modified_t | last_modified_datetime | product_name | generic_name | quantity | ... | ph_100g | fruits-vegetables-nuts_100g | collagen-meat-protein-ratio_100g | cocoa_100g | chlorophyl_100g | carbon-footprint_100g | nutrition-score-fr_100g | nutrition-score-uk_100g | glycemic-index_100g | water-hardness_100g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3087 | http://world-fr.openfoodfacts.org/produit/0000... | openfoodfacts-contributors | 1474103866 | 2016-09-17T09:17:46Z | 1474103893 | 2016-09-17T09:18:13Z | Farine de blé noir | NaN | 1kg | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 4530 | http://world-fr.openfoodfacts.org/produit/0000... | usda-ndb-import | 1489069957 | 2017-03-09T14:32:37Z | 1489069957 | 2017-03-09T14:32:37Z | Banana Chips Sweetened (Whole) | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | 14.0 | 14.0 | NaN | NaN |
| 2 | 4559 | http://world-fr.openfoodfacts.org/produit/0000... | usda-ndb-import | 1489069957 | 2017-03-09T14:32:37Z | 1489069957 | 2017-03-09T14:32:37Z | Peanuts | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | 0.0 | NaN | NaN |
3 rows × 162 columns
| | code | url | creator | created_t | created_datetime | last_modified_t | last_modified_datetime | product_name | generic_name | quantity | ... | ph_100g | fruits-vegetables-nuts_100g | collagen-meat-protein-ratio_100g | cocoa_100g | chlorophyl_100g | carbon-footprint_100g | nutrition-score-fr_100g | nutrition-score-uk_100g | glycemic-index_100g | water-hardness_100g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 320769 | 9970229501521 | http://world-fr.openfoodfacts.org/produit/9970... | tomato | 1422099377 | 2015-01-24T11:36:17Z | 1491244499 | 2017-04-03T18:34:59Z | 乐吧泡菜味薯片 | Leba pickle flavor potato chips | 50 g | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320770 | 9980282863788 | http://world-fr.openfoodfacts.org/produit/9980... | openfoodfacts-contributors | 1492340089 | 2017-04-16T10:54:49Z | 1492340089 | 2017-04-16T10:54:49Z | Tomates aux Vermicelles | NaN | 67g | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 320771 | 999990026839 | http://world-fr.openfoodfacts.org/produit/9999... | usda-ndb-import | 1489072709 | 2017-03-09T15:18:29Z | 1491244499 | 2017-04-03T18:34:59Z | Sugar Free Drink Mix, Peach Tea | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 rows × 162 columns
(320772, 162)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 320772 entries, 0 to 320771
Columns: 162 entries, code to water-hardness_100g
dtypes: float64(106), object(56)
memory usage: 396.5+ MB
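The cells that produced the previews, shape, and summary above are not preserved in this export. A minimal sketch of how such output is typically obtained with pandas follows. It assumes the data is a tab-separated export; the in-memory `sample_tsv` below is a hypothetical stand-in for the real file, whose path is not given here.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for the real export (assumed tab-separated).
sample_tsv = io.StringIO(
    "code\tproduct_name\tenergy_100g\n"
    "3087\tFarine de blé noir\t\n"
    "4530\tBanana Chips Sweetened (Whole)\t2243\n"
    "4559\tPeanuts\t1941\n"
)

# low_memory=False makes pandas infer each column's dtype from the whole
# file at once, which avoids mixed-type warnings on very wide files.
df = pd.read_csv(sample_tsv, sep="\t", low_memory=False)

print(df.head(3))   # first rows
print(df.tail(3))   # last rows
print(df.shape)     # (rows, columns)
df.info()           # dtypes and memory usage
```

With the real file, the `io.StringIO` object would be replaced by the path to the downloaded export.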
code 0.000072
url 0.000072
creator 0.000006
created_t 0.000009
created_datetime 0.000028
...
carbon-footprint_100g 0.999165
nutrition-score-fr_100g 0.310382
nutrition-score-uk_100g 0.310382
glycemic-index_100g 1.000000
water-hardness_100g 1.000000
Length: 162, dtype: float64
| | %null | Col |
|---|---|---|
| water-hardness_100g | 1.0 | water-hardness_100g |
| no_nutriments | 1.0 | no_nutriments |
| ingredients_that_may_be_from_palm_oil | 1.0 | ingredients_that_may_be_from_palm_oil |
| nutrition_grade_uk | 1.0 | nutrition_grade_uk |
| nervonic-acid_100g | 1.0 | nervonic-acid_100g |
| | %null | Col |
|---|---|---|
| fiber_100g | 0.373742 | fiber_100g |
| serving_size | 0.341180 | serving_size |
| nutrition-score-uk_100g | 0.310382 | nutrition-score-uk_100g |
| nutrition-score-fr_100g | 0.310382 | nutrition-score-fr_100g |
| nutrition_grade_fr | 0.310382 | nutrition_grade_fr |
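The per-column null fractions and the two `%null` / `Col` listings above could be produced along these lines. This is a sketch on a tiny synthetic frame, not the original code. The cutoff that reduced the data from 162 to 34 columns is not stated in this document, so `MAX_NULL_FRACTION` is a placeholder assumption.

```python
import numpy as np
import pandas as pd

# Small synthetic frame standing in for the full dataset.
df = pd.DataFrame({
    "code": ["1", "2", "3", "4"],
    "fiber_100g": [3.6, np.nan, 7.1, 1.0],
    "water-hardness_100g": [np.nan] * 4,
})

# Fraction of missing values per column (mean of a boolean mask).
null_frac = df.isnull().mean()

# Two-column table like the one shown above, sorted from most to least missing.
null_table = null_frac.to_frame("%null")
null_table["Col"] = null_table.index
null_table = null_table.sort_values("%null", ascending=False)

# Assumption: keep only columns below some null-fraction cutoff.
MAX_NULL_FRACTION = 0.4  # hypothetical value, not taken from the source
keep = null_frac[null_frac < MAX_NULL_FRACTION].index
df_reduced = df[keep]

print(null_table)
print(df_reduced.shape)
```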
| | code | url | creator | created_t | created_datetime | last_modified_t | last_modified_datetime | product_name | brands | brands_tags | ... | fat_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | sodium_100g | nutrition-score-fr_100g | nutrition-score-uk_100g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3087 | http://world-fr.openfoodfacts.org/produit/0000... | openfoodfacts-contributors | 1474103866 | 2016-09-17T09:17:46Z | 1474103893 | 2016-09-17T09:18:13Z | Farine de blé noir | Ferme t'y R'nao | ferme-t-y-r-nao | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 4530 | http://world-fr.openfoodfacts.org/produit/0000... | usda-ndb-import | 1489069957 | 2017-03-09T14:32:37Z | 1489069957 | 2017-03-09T14:32:37Z | Banana Chips Sweetened (Whole) | NaN | NaN | ... | 28.57 | 28.57 | 64.29 | 14.29 | 3.6 | 3.57 | 0.000 | 0.00 | 14.0 | 14.0 |
| 2 | 4559 | http://world-fr.openfoodfacts.org/produit/0000... | usda-ndb-import | 1489069957 | 2017-03-09T14:32:37Z | 1489069957 | 2017-03-09T14:32:37Z | Peanuts | Torn & Glasser | torn-glasser | ... | 17.86 | 0.00 | 60.71 | 17.86 | 7.1 | 17.86 | 0.635 | 0.25 | 0.0 | 0.0 |
3 rows × 34 columns
| | additives_n | ingredients_from_palm_oil_n | ingredients_that_may_be_from_palm_oil_n | energy_100g | fat_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | sodium_100g | nutrition-score-fr_100g | nutrition-score-uk_100g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 248939.000000 | 248939.000000 | 248939.000000 | 2.611130e+05 | 243891.000000 | 229554.000000 | 243588.000000 | 244971.000000 | 200886.000000 | 259922.000000 | 255510.000000 | 255463.000000 | 221210.000000 | 221210.000000 |
| mean | 1.936024 | 0.019659 | 0.055246 | 1.141915e+03 | 12.730379 | 5.129932 | 32.073981 | 16.003484 | 2.862111 | 7.075940 | 2.028624 | 0.798815 | 9.165535 | 9.058049 |
| std | 2.502019 | 0.140524 | 0.269207 | 6.447154e+03 | 17.578747 | 8.014238 | 29.731719 | 22.327284 | 12.867578 | 8.409054 | 128.269454 | 50.504428 | 9.055903 | 9.183589 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 | 0.000000 | -17.860000 | -6.700000 | -800.000000 | 0.000000 | 0.000000 | -15.000000 | -15.000000 |
| 25% | 0.000000 | 0.000000 | 0.000000 | 3.770000e+02 | 0.000000 | 0.000000 | 6.000000 | 1.300000 | 0.000000 | 0.700000 | 0.063500 | 0.025000 | 1.000000 | 1.000000 |
| 50% | 1.000000 | 0.000000 | 0.000000 | 1.100000e+03 | 5.000000 | 1.790000 | 20.600000 | 5.710000 | 1.500000 | 4.760000 | 0.581660 | 0.229000 | 10.000000 | 9.000000 |
| 75% | 3.000000 | 0.000000 | 0.000000 | 1.674000e+03 | 20.000000 | 7.140000 | 58.330000 | 24.000000 | 3.600000 | 10.000000 | 1.374140 | 0.541000 | 16.000000 | 16.000000 |
| max | 31.000000 | 2.000000 | 6.000000 | 3.251373e+06 | 714.290000 | 550.000000 | 2916.670000 | 3520.000000 | 5380.000000 | 430.000000 | 64312.800000 | 25320.000000 | 40.000000 | 40.000000 |
(320772, 34)
| | additives_n | ingredients_from_palm_oil_n | energy_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | nutrition-score-fr_100g |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 248939.000000 | 248939.000000 | 2.611130e+05 | 229554.000000 | 243588.000000 | 244971.000000 | 200886.000000 | 259922.000000 | 255510.000000 | 221210.000000 |
| mean | 1.936024 | 0.019659 | 1.141915e+03 | 5.129932 | 32.073981 | 16.003484 | 2.862111 | 7.075940 | 2.028624 | 9.165535 |
| std | 2.502019 | 0.140524 | 6.447154e+03 | 8.014238 | 29.731719 | 22.327284 | 12.867578 | 8.409054 | 128.269454 | 9.055903 |
| min | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 | -17.860000 | -6.700000 | -800.000000 | 0.000000 | -15.000000 |
| 25% | 0.000000 | 0.000000 | 3.770000e+02 | 0.000000 | 6.000000 | 1.300000 | 0.000000 | 0.700000 | 0.063500 | 1.000000 |
| 50% | 1.000000 | 0.000000 | 1.100000e+03 | 1.790000 | 20.600000 | 5.710000 | 1.500000 | 4.760000 | 0.581660 | 10.000000 |
| 75% | 3.000000 | 0.000000 | 1.674000e+03 | 7.140000 | 58.330000 | 24.000000 | 3.600000 | 10.000000 | 1.374140 | 16.000000 |
| max | 31.000000 | 2.000000 | 3.251373e+06 | 550.000000 | 2916.670000 | 3520.000000 | 5380.000000 | 430.000000 | 64312.800000 | 40.000000 |
| | code | brands | countries_tags | additives_n | ingredients_from_palm_oil_n | nutrition_grade_fr | energy_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | nutrition-score-fr_100g |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4530 | NA | en:united-states | 0.0 | No | d | 2243.0 | 28.57 | 64.29 | 14.29 | 3.6 | 3.57 | 0.00000 | 14.0 |
| 1 | 4559 | Torn & Glasser | en:united-states | 0.0 | No | b | 1941.0 | 0.00 | 60.71 | 17.86 | 7.1 | 17.86 | 0.63500 | 0.0 |
| 2 | 16087 | Grizzlies | en:united-states | 0.0 | No | d | 2540.0 | 5.36 | 17.86 | 3.57 | 7.1 | 17.86 | 1.22428 | 12.0 |
| | additives_n | energy_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | nutrition-score-fr_100g |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 2243.0 | 28.57 | 64.29 | 14.29 | 3.60 | 3.570 | 0.00000 | 14.0 |
| 1 | 0.0 | 1941.0 | 0.00 | 60.71 | 17.86 | 7.10 | 17.860 | 0.63500 | 0.0 |
| 2 | 0.0 | 2540.0 | 5.36 | 17.86 | 3.57 | 7.10 | 17.860 | 1.22428 | 12.0 |
| 3 | 2.0 | 1833.0 | 4.69 | 57.81 | 15.62 | 9.40 | 14.060 | 0.13970 | 7.0 |
| 4 | 1.0 | 2230.0 | 5.00 | 36.67 | 3.33 | 6.70 | 16.670 | 1.60782 | 12.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 170705 | 5.0 | 1031.0 | 1.28 | 95.31 | 0.10 | 1.47 | 0.004 | 0.00100 | 2.0 |
| 170706 | 1.0 | 1393.0 | 2.78 | 61.11 | 30.56 | 8.30 | 5.560 | 0.95250 | 11.0 |
| 170707 | 0.0 | 1477.0 | 0.00 | 87.06 | 2.35 | 4.70 | 1.180 | 0.03048 | -1.0 |
| 170708 | 0.0 | 21.0 | 0.20 | 0.50 | 0.50 | 0.20 | 0.500 | 0.02540 | 2.0 |
| 170709 | 0.0 | 0.0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.000 | 0.00000 | 0.0 |
170710 rows × 9 columns
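The nine-column numeric frame above has a contiguous index from 0 to 170709, which is consistent with selecting the numeric features, dropping incomplete rows, and resetting the index. The sketch below shows that pattern on synthetic data. The exact cleaning steps used in the original are not shown, so treat this as one plausible reconstruction.

```python
import numpy as np
import pandas as pd

# Column names taken from the table above.
NUMERIC_COLS = [
    "additives_n", "energy_100g", "saturated-fat_100g", "carbohydrates_100g",
    "sugars_100g", "fiber_100g", "proteins_100g", "salt_100g",
    "nutrition-score-fr_100g",
]

# Synthetic stand-in: 5 products, two of them with a missing value.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.uniform(0, 100, size=(5, len(NUMERIC_COLS))), columns=NUMERIC_COLS
)
df["brands"] = ["a", "b", "c", "d", "e"]
df.loc[1, "fiber_100g"] = np.nan
df.loc[3, "salt_100g"] = np.nan

# Keep the numeric features, drop any row with a missing value,
# and renumber the rows from 0.
features = df[NUMERIC_COLS].dropna().reset_index(drop=True)

print(features.shape)
```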
| | additives_n | energy_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | nutrition-score-fr_100g |
|---|---|---|---|---|---|---|---|---|---|
| count | 170710.000000 | 1.707100e+05 | 170710.000000 | 170710.000000 | 170710.000000 | 170710.000000 | 170710.000000 | 170710.000000 | 170710.000000 |
| mean | 1.967916 | 1.206842e+03 | 4.635023 | 34.584268 | 14.994453 | 2.865459 | 7.750171 | 1.375087 | 8.796579 |
| std | 2.517014 | 7.903058e+03 | 6.980362 | 28.227493 | 19.421789 | 4.403977 | 7.953934 | 14.621370 | 9.077207 |
| min | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 | -17.860000 | 0.000000 | -3.570000 | 0.000000 | -15.000000 |
| 25% | 0.000000 | 4.520000e+02 | 0.000000 | 8.000000 | 1.500000 | 0.000000 | 2.100000 | 0.116840 | 1.000000 |
| 50% | 1.000000 | 1.218000e+03 | 1.670000 | 26.670000 | 5.300000 | 1.600000 | 5.670000 | 0.678180 | 9.000000 |
| 75% | 3.000000 | 1.745000e+03 | 6.800000 | 60.200000 | 23.330000 | 3.600000 | 10.710000 | 1.361440 | 16.000000 |
| max | 31.000000 | 3.251373e+06 | 210.000000 | 209.380000 | 134.000000 | 178.000000 | 100.000000 | 3048.000000 | 40.000000 |
| nutrition_count | nutrition_a | nutrition_b | nutrition_c | nutrition_d | nutrition_e |
|---|---|---|---|---|---|
| 0 | 1096.0 | 1941.0 | 1833.0 | 2243.0 | 2092.0 |
| 1 | 1887.0 | 1824.0 | 1674.0 | 2540.0 | 2197.0 |
| 2 | 1904.0 | 2632.0 | 1954.0 | 2230.0 | 1883.0 |
| 3 | 1749.0 | 1548.0 | 1941.0 | 1464.0 | 2197.0 |
| 4 | 159.0 | 1674.0 | 1904.0 | 2092.0 | 1569.0 |
89.92095963378982
1.806909715123044e-76
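The two numbers above look like a test statistic and its p-value. The values in the preceding pivot match `energy_100g` for products of each nutrition grade (for example 1941.0 under `nutrition_b` and 2243.0 under `nutrition_d`), so a one-way ANOVA of energy across grades is a plausible reading. The sketch below assumes that interpretation and uses `scipy.stats.f_oneway` on synthetic data; it does not reproduce the figures above.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data: 50 products per grade, mean energy rising from a to e.
rng = np.random.default_rng(0)
grades = ["a", "b", "c", "d", "e"]
df = pd.DataFrame({
    "nutrition_grade_fr": np.repeat(grades, 50),
    "energy_100g": np.concatenate(
        [rng.normal(1000 + 200 * i, 300, 50) for i in range(len(grades))]
    ),
})

# One array of energy values per grade.
groups = [
    g["energy_100g"].to_numpy() for _, g in df.groupby("nutrition_grade_fr")
]

# One-way ANOVA: tests whether the group means differ.
f_stat, p_value = stats.f_oneway(*groups)

print(f_stat)
print(p_value)
```

A large F statistic with a very small p-value, as in the output above, indicates that mean energy differs between at least two of the grades.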
array([[-0.78184784, 1.40053847, 3.43791013, ..., -0.52556421,
-0.09403702, 0.57333893],
[-0.78184784, 1.00000986, -0.66552251, ..., 1.2710571 ,
-0.05060909, -0.96903005],
[-0.78184784, 1.79443582, 0.10431995, ..., 1.2710571 ,
-0.01030798, 0.3530005 ],
...,
[-0.78184784, 0.38462815, -0.66552251, ..., -0.82604881,
-0.09195248, -1.07919926],
[-0.78184784, -1.54639722, -0.63679704, ..., -0.91154233,
-0.09229991, -0.74869162],
[-0.78184784, -1.57424854, -0.66552251, ..., -0.97440522,
-0.09403702, -0.96903005]])
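The array above looks like standardized (z-scored) features. Its first column, -0.78184784, is consistent with `additives_n = 0` scaled by the mean and standard deviation reported earlier: (0 - 1.9679) / 2.5170 ≈ -0.782. The scaler used is not shown; the sketch below assumes scikit-learn's `StandardScaler` and uses a few rows of three features for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# A few illustrative rows: additives_n, energy_100g, saturated-fat_100g.
X = np.array([
    [0.0, 2243.0, 28.57],
    [0.0, 1941.0, 0.00],
    [2.0, 1833.0, 4.69],
    [1.0, 2230.0, 5.00],
])

# Subtract each column's mean and divide by its (population) standard
# deviation, so every feature has mean 0 and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled)
```

Scaling matters before PCA because features such as `energy_100g` have a far larger numeric range than the others and would otherwise dominate the components.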
PCA(n_components=4)
array([0.31020206, 0.18258611, 0.153259 , 0.11133615])
array([0.31020206, 0.49278817, 0.64604718, 0.75738332])
array([ 0.10280092, 0.52236288, 0.38150724, 0.38947938, 0.39423053,
0.10101489, 0.09058342, -0.00297288, 0.4954706 ])
array([-0.3573938 , 0.21887959, 0.30557293, -0.29689876, -0.43302178,
0.30035385, 0.60365517, 0.02248405, 0.01457077])
array([-0.1720052 , 0.11640431, -0.40173669, 0.46440778, 0.10245973,
0.65433582, 0.0470006 , -0.06691056, -0.36668495])
array([ 9.81346784e-02, -1.89275569e-04, -8.22712122e-02, 4.76224266e-02,
-2.53504418e-02, 5.11189066e-02, 4.24901372e-02, 9.87970300e-01,
1.36597580e-02])
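The outputs above correspond to a four-component PCA: the per-component explained variance ratio, its cumulative sum (about 75.7% for four components), and four loading vectors of length nine, one weight per input feature. A sketch of how these quantities are obtained with scikit-learn follows. It runs on synthetic data, so the numbers will differ from those above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the nine standardized features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))
X[:, 1] += 2 * X[:, 0]  # make two features correlated
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=4)
scores = pca.fit_transform(X_scaled)    # projected data, shape (200, 4)

ratio = pca.explained_variance_ratio_   # variance share per component
cumulative = np.cumsum(ratio)           # running total of variance explained
loadings = pca.components_              # shape (4, 9), one row per component

print(ratio)
print(cumulative)
print(loadings[0])
```

Each row of `components_` has unit length, which also holds for the four arrays printed above.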
array([[-0.78184784, 1.40053847, 3.43791013, ..., -0.52556421,
-0.09403702, 0.57333893],
[-0.78184784, 1.00000986, -0.66552251, ..., 1.2710571 ,
-0.05060909, -0.96903005],
[-0.78184784, 1.79443582, 0.10431995, ..., 1.2710571 ,
-0.01030798, 0.3530005 ],
...,
[-0.78184784, 0.38462815, -0.66552251, ..., -0.82604881,
-0.09195248, -1.07919926],
[-0.78184784, -1.54639722, -0.63679704, ..., -0.91154233,
-0.09229991, -0.74869162],
[-0.78184784, -1.57424854, -0.66552251, ..., -0.97440522,
-0.09403702, -0.96903005]])
| | additives_n | energy_100g | saturated-fat_100g | carbohydrates_100g | sugars_100g | fiber_100g | proteins_100g | salt_100g | nutrition-score-fr_100g | Category |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 2243.0 | 28.57 | 64.29 | 14.29 | 3.6 | 3.57 | 0.00000 | 14.0 | bad |
| 1 | 0.0 | 1941.0 | 0.00 | 60.71 | 17.86 | 7.1 | 17.86 | 0.63500 | 0.0 | good |
| 2 | 0.0 | 2540.0 | 5.36 | 17.86 | 3.57 | 7.1 | 17.86 | 1.22428 | 12.0 | worst |
| 3 | 2.0 | 1833.0 | 4.69 | 57.81 | 15.62 | 9.4 | 14.06 | 0.13970 | 7.0 | good |
| 4 | 1.0 | 2230.0 | 5.00 | 36.67 | 3.33 | 6.7 | 16.67 | 1.60782 | 12.0 | good |
Counter({2: 34462, 3: 32980, 0: 21238, 1: 64500, 4: 17506, 5: 8})
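The `Counter` above has six integer keys (0 to 5), which suggests cluster labels from a six-cluster model. The algorithm and its settings are not shown in this export. The sketch below assumes k-means from scikit-learn and, for brevity, uses three well-separated synthetic blobs instead of six clusters.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the PCA scores: three blobs in four dimensions.
rng = np.random.default_rng(0)
scores = np.vstack(
    [rng.normal(loc=c, scale=0.3, size=(30, 4)) for c in (-3.0, 0.0, 3.0)]
)

# Assumption: k-means; n_clusters would be 6 to match the output above.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

# Count how many points fall into each cluster.
counts = Counter(labels.tolist())
print(counts)
```

Note that one cluster in the output above contains only 8 products, which usually points to a handful of extreme outliers forming their own group.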